    A discontinuity in pattern inference

    This paper examines the learnability of a major subclass of E-pattern languages – also known as erasing or extended pattern languages – in Gold’s learning model. We show that the class of terminal-free E-pattern languages is inferrable from positive data if the corresponding terminal alphabet consists of three or more letters. Consequently, the recently presented negative result for binary alphabets is unique.
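As an illustration of what "erasing" means here, the following is a minimal sketch of a membership test for an E-pattern language: a word belongs to the language of a pattern if it arises by replacing every variable consistently with some terminal word, where the empty word is allowed. The function name `in_e_pattern_language` and the convention that uppercase letters are variables and all other characters are terminals are assumptions made only for this example; they do not come from the paper. The search is plain backtracking and is meant for short inputs only.

```python
def in_e_pattern_language(pattern, word, is_variable=str.isupper):
    """Return True if `word` is in the E-pattern language of `pattern`.

    Assumed convention (hypothetical): uppercase characters are variables,
    every other character is a terminal symbol.
    """

    def match(i, pos, binding):
        # All pattern symbols consumed: succeed only if the word is used up.
        if i == len(pattern):
            return pos == len(word)
        symbol = pattern[i]
        if not is_variable(symbol):
            # A terminal must appear literally at the current position.
            return word.startswith(symbol, pos) and match(i + 1, pos + 1, binding)
        if symbol in binding:
            # A variable seen before must be replaced by the same word again.
            image = binding[symbol]
            return word.startswith(image, pos) and match(i + 1, pos + len(image), binding)
        # First occurrence: try every image, starting with the empty word.
        for end in range(pos, len(word) + 1):
            binding[symbol] = word[pos:end]
            if match(i + 1, end, binding):
                return True
        del binding[symbol]
        return False

    return match(0, 0, {})


if __name__ == "__main__":
    print(in_e_pattern_language("XX", "abab"))  # X -> "ab"
    print(in_e_pattern_language("XaX", "a"))    # X -> "" (erasing substitution)
    print(in_e_pattern_language("XX", "aba"))   # no consistent substitution
```

The second call succeeds only because variables may be erased; under a non-erasing definition the word `a` would not be generated by `XaX`.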

    Discontinuities in pattern inference

    This paper deals with the inferrability of classes of E-pattern languages (also referred to as extended or erasing pattern languages) from positive data in Gold’s model of identification in the limit. The first main part of the paper shows that the recently presented negative result on terminal-free E-pattern languages over binary alphabets does not hold for other alphabet sizes, so that the full class of these languages is inferrable from positive data if and only if the corresponding terminal alphabet does not consist of exactly two distinct letters. The second main part yields the insight that the positive result on terminal-free E-pattern languages over alphabets with three or four letters cannot be extended to the class of general E-pattern languages. With regard to larger alphabets, the extensibility remains open. The proof methods developed for these main results do not directly discuss the (non-)existence of appropriate learning strategies, but they deal with structural properties of classes of E-pattern languages, and, in particular, with the problem of finding telltales for these languages. It is shown that the inferrability of classes of E-pattern languages is closely connected to some problems on the ambiguity of morphisms, so that the technical contributions of the paper largely consist of combinatorial insights into morphisms in word monoids.

    On the learnability of E-pattern languages over small alphabets

    This paper deals with two well-discussed, but largely open problems on E-pattern languages, also known as extended or erasing pattern languages: primarily, the learnability in Gold’s learning model and, secondarily, the decidability of the equivalence. As the main result, we show that the full class of E-pattern languages is not inferrable from positive data if the corresponding terminal alphabet consists of exactly three or of exactly four letters – an insight that remarkably contrasts with the recent positive finding on the learnability of the subclass of terminal-free E-pattern languages for these alphabets. As a side effect of our reasoning thereon, we reveal some particular example patterns that disprove a conjecture of Ohlebusch and Ukkonen (Theoretical Computer Science 186, 1997) on the decidability of the equivalence of E-pattern languages.

    A negative result on inductive inference of extended pattern languages

    A negative result on inductive inference of extended pattern languages

    On the equivalence problem for E-pattern languages over small alphabets

    We contribute new facets to the discussion on the equivalence problem for E-pattern languages (also referred to as extended or erasing pattern languages). This fundamental open question asks for the existence of a computable function that, given any pair of patterns, decides whether or not they generate the same language. Our main result disproves Ohlebusch and Ukkonen’s conjecture (Theoretical Computer Science 186, 1997) on the equivalence problem; the respective argumentation, which largely deals with the nondeterminism of pattern languages, is restricted to terminal alphabets with at most four distinct letters.

    A non-learnable class of E-pattern languages

    We investigate the inferrability of E-pattern languages (also known as extended or erasing pattern languages) from positive data in Gold’s learning model. As the main result, our analysis yields a negative outcome for the full class of E-pattern languages – and even for the subclass of terminal-free E-pattern languages – if the corresponding terminal alphabet consists of exactly two distinct letters. Furthermore, we present a positive result for a manifest subclass of terminal-free E-pattern languages. We point out that the considered problems are closely related to fundamental questions concerning the nondeterminism of E-pattern languages.

    The unambiguity of segmented morphisms

    This paper studies the ambiguity of morphisms in free monoids. A morphism σ is said to be ambiguous with respect to a string α if there exists a morphism τ which differs from σ for a symbol occurring in α, but nevertheless satisfies τ(α) = σ(α); if there is no such τ then σ is called unambiguous. Motivated by the recent initial paper on the ambiguity of morphisms, we introduce the definition of a so-called segmented morphism σₙ, which, for any n ∈ ℕ, maps every symbol in an infinite alphabet onto a word that consists of n distinct factors in ab⁺a, where a and b are different letters. For every n, we consider the set U(σₙ) of those finite strings over an infinite alphabet with respect to which σₙ is unambiguous, and we comprehensively describe its relation to any U(σₘ), m ≠ n. Thus, our work features the first approach to a characterisation of sets of strings with respect to which certain fixed morphisms are unambiguous, and it leads to fairly counter-intuitive insights into the relations between such sets. Furthermore, it shows that, among the widely used homogeneous morphisms, most segmented morphisms are optimal in terms of being unambiguous for a preferably large set of strings. Finally, our paper yields several major improvements of crucial techniques previously used for research on the ambiguity of morphisms.
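The definition of ambiguity given in this abstract can be checked by brute force on small inputs. The sketch below is not the paper's technique; it simply enumerates every way of cutting σ(α) into |α| consecutive factors and looks for a consistent assignment τ that differs from σ on some symbol of α. The helper names (`apply_morphism`, `factorisations`, `is_ambiguous`) are hypothetical, and the code assumes that τ may map symbols to the empty word, which the abstract does not state explicitly.

```python
def apply_morphism(morphism, pattern):
    """Concatenate the images of the symbols of `pattern` under `morphism`."""
    return "".join(morphism[symbol] for symbol in pattern)


def factorisations(word, k):
    """Yield every split of `word` into k consecutive, possibly empty, factors."""
    if k == 0:
        if word == "":
            yield ()
        return
    for cut in range(len(word) + 1):
        for rest in factorisations(word[cut:], k - 1):
            yield (word[:cut],) + rest


def is_ambiguous(sigma, pattern):
    """Return True if some morphism tau != sigma on a symbol of `pattern`
    satisfies tau(pattern) == sigma(pattern).

    Assumption: tau is allowed to erase symbols (map them to "").
    """
    target = apply_morphism(sigma, pattern)
    for pieces in factorisations(target, len(pattern)):
        tau = {}
        consistent = True
        for symbol, piece in zip(pattern, pieces):
            # Every occurrence of a symbol must receive the same image.
            if tau.setdefault(symbol, piece) != piece:
                consistent = False
                break
        if consistent and any(tau[s] != sigma[s] for s in tau):
            return True
    return False


if __name__ == "__main__":
    print(is_ambiguous({"x": "a", "y": "b"}, "xyx"))  # x -> "", y -> "aba" also works
    print(is_ambiguous({"x": "ab"}, "xx"))            # only x -> "ab" yields "abab"
```

The running time grows exponentially with the length of the pattern, so this is suitable only for experimenting with very short strings.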

    Inferring descriptive generalisations of formal languages

    In the present paper, we introduce a variant of Gold-style learners that is not required to infer precise descriptions of the languages in a class, but that must find descriptive patterns, i.e., optimal generalisations within a class of pattern languages. Our first main result characterises those indexed families of recursive languages that can be inferred by such learners, and we demonstrate that this characterisation shows enlightening connections to Angluin’s corresponding result for exact inference. Using a notion of descriptiveness that is restricted to the natural subclass of terminal-free E-pattern languages, we introduce a generic inference strategy, and our second main result characterises those classes of languages that can be generalised by this strategy. This characterisation demonstrates that there are major classes of languages that can be generalised in our model, but cannot be inferred by a normal Gold-style learner. Our corresponding technical considerations lead to deep insights of intrinsic interest into combinatorial and algorithmic properties of pattern languages.

    Inferring descriptive generalisations of formal languages

    In the present paper, we introduce a variant of Gold-style learners that is not required to infer precise descriptions of the languages in a class, but that must find descriptive patterns, i.e., optimal generalisations within a class of pattern languages. Our first main result characterises those indexed families of recursive languages that can be inferred by such learners, and we demonstrate that this characterisation shows enlightening connections to Angluin's corresponding result for exact inference. Furthermore, this result reveals that our model can be interpreted as an instance of a natural extension of Gold's model of language identification in the limit. Using a notion of descriptiveness that is restricted to the natural subclass of terminal-free E-pattern languages, we introduce a generic inference strategy, and our second main result characterises those classes of languages that can be generalised by this strategy. This characterisation demonstrates that there are major classes of languages that can be generalised in our model, but not be inferred by a normal Gold-style learner. Our corresponding technical considerations lead to insights of intrinsic interest into combinatorial and algorithmic properties of pattern languages.